How does Engine22 bring pixels to your screen? How does a game draw its graphics in general? For me, as an unofficial graphics-programmer, it all pretty much makes sense. But when other people ask about it (programmers included), it seems to be a pretty mysterious area. And for those who haven't touched "graphics" for the last, say, ten years: quite a lot may have changed in the meantime.
Quite
some years ago, an old friend without any deeper computer background thought I
really programmed every pixel you could possibly see. Not in the sense of
so-called shaders, but really plotting the colours of a monster-model on the
screen, pixel-by-pixel, using code-lines only. Well, thank the Lord it doesn’t work
like that exactly. But then, HOW does it work?
Have a Sprite
Graphics
is a very complex subject, with multiple approaches and several layers. There
is no single perfect way to draw something. Though most games use the same
basic principles and help-libraries more or less. On a global level, we could
divide computer-graphics into 2D and 3D to begin with. Although technically 3D
techniques overlap 2D (you can draw Super Mario using a 3D engine – and many
modern 2D games are actually semi-3D), old 2D games you saw on a nineties
Nintendo, used sprite-based engines.
A sprite
is basically a 2D image. Like the ones you can draw in Paint (or used to draw back when Paint was still a good tool for pixel-artists; modern Paint is useless). In addition, sprites often have a transparent area. For example, all pink pixels would become invisible so you could see a background layer through the image. Also, sprites could be animated, by playing multiple images quickly enough after each other. Obviously the more "frames", the smoother the animation. But, and this is typical for the old sprite era, computer memory was like Brontosaurus brains. Very little. Thus low-resolution sprites, just a few colours (typically 16 or 256), and just a few frames and/or little animation in general.
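To make those two sprite tricks concrete, here is a minimal sketch of a colour-key for transparency (pink pixels become invisible) and of picking the current animation frame from the elapsed time. The struct layout and packed pixel format are assumptions for illustration only, not how any particular console did it.

```cpp
#include <cstdint>

struct Sprite {
    int width, height;
    int frameCount;
    const uint32_t* pixels;   // frameCount * width * height packed RGBA pixels (assumed layout)
};

// Which animation frame should be visible right now?
int currentFrame(const Sprite& s, float timeSeconds, float framesPerSecond)
{
    return static_cast<int>(timeSeconds * framesPerSecond) % s.frameCount;
}

// Is this pixel the "transparent" colour-key?
bool isTransparent(uint32_t rgba)
{
    const uint32_t PINK = 0xFF00FFFF;   // R=255, G=0, B=255, A=255
    return rgba == PINK;                // skip this pixel, let the background show through
}
```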
Goro, the 4-armed sprite dude from Mortal Kombat.
When we think
about sprites, we usually think about Pac-Man, Street Fighter puppets or Donkey Kong throwing barrels. But the environment was made of sprites as well. The reason why Super Mario is so… blocky, is that the world was simply a (2D) raster. Via a map-editor program, you could assign a value to each raster-cell. A cell was either unoccupied (passable), a brick block, a question-mark block, or maybe water. And again, the background was made of a raster, but usually with larger cells. Later Marios would allow sloped (thus partially transparent) cells, by the way.
So
typically an old-fashioned 2D "platform-game" engine gave us a few layers (sky, background, foreground you walk/jump on) for the environment, and (animated) sprites for our characters, bullet projectiles, explosions, or whatever it was. The engine would figure out which cells are currently visible on the screen, and then draw them cell-by-cell, sprite-by-sprite. In the right order: background sprites first, foreground sprites last. And of course, the hardware of a Sega, Nintendo or PC provided special ways to do this as fast as possible, without flickering. Terribly slow and primitive by today's standards, but pretty awesome back then.
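As a rough sketch of that cell-by-cell drawing, here is what such a loop could look like. The map is a raster of cell indices, and each frame we only draw the cells that fall inside the camera view. The "drawSprite" helper is hypothetical; a real engine/console would blit through its own (hardware) routines.

```cpp
#include <vector>

const int CELL_SIZE = 16;                       // cells of 16x16 pixels (assumption)

struct TileMap {
    int width, height;                          // raster size in cells
    std::vector<int> cells;                     // 0 = empty, 1 = brick, 2 = question block, ...
    int at(int x, int y) const { return cells[y * width + x]; }
};

void drawSprite(int spriteId, int screenX, int screenY);   // assumed to exist elsewhere

void drawVisibleCells(const TileMap& map, int cameraX, int cameraY,
                      int screenWidth, int screenHeight)
{
    // Which cells are currently visible on the screen?
    int firstCol = cameraX / CELL_SIZE;
    int firstRow = cameraY / CELL_SIZE;
    int lastCol  = (cameraX + screenWidth)  / CELL_SIZE;
    int lastRow  = (cameraY + screenHeight) / CELL_SIZE;

    // Draw them cell-by-cell, background layers would be done the same way first.
    for (int row = firstRow; row <= lastRow && row < map.height; ++row)
        for (int col = firstCol; col <= lastCol && col < map.width; ++col)
            if (map.at(col, row) != 0)                      // skip empty (passable) cells
                drawSprite(map.at(col, row),
                           col * CELL_SIZE - cameraX,       // position relative to the camera
                           row * CELL_SIZE - cameraY);
}
```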
Next station, 3D
2D
worlds made out of flat images have one little problem: you can move and even zoom the camera, but you can't rotate it. There is no depth data whatsoever. 3D engines made in the last years of our beloved nineties took a whole different approach (and I'm skipping SNES Mode7 graphics for Mario Kart, or 2.5D engines like the ones used for Wolfenstein or Duke Nukem 3D). Whereas 2D "sprites" were the main resources to build a 2D game, artists now had to learn how to model 3D objects. You know, those wireframe things. To make a box in a 2D game, you would just draw a rectangle, store the bitmap, and load it back into your game engine. But now we had to plot 8 corner coordinates called "vertices", and connect them by "drawing" triangles. Paint-like programs got joined by (more complicated) 3D modelling programs, like Maya, Max, Lightwave, Milkshape, Blender, TrueSpace, et cetera.
A bit
like drawing lines, but now in 3D space. A (game) 3D model is made out of triangles. Like the name says, a flat surface with 3 corners. Why is that? Because (even to this day) we make hardware specialized in drawing these triangle primitives. Polygons with 4 or more corners would also be possible in theory, but they give a lot of complications, mainly mathematical ones. Anyway, Lara Croft is made out of many small connected triangles. Though 15 years ago Lara wouldn't have that many triangles, resulting in less rounded boobs.
How the
hell does an artist make so many tiny triangles, in such a way that it actually looks like a building, soldier or car? Sounds like an impossible job. Yeah, it is difficult. But fortunately those 3D modelling programs I just mentioned have a lot of special tools. There are even programs like ZBrush that sort of "feel" (but then without the actual feeling) like claying or sculpting. You have a massive blob made of millions of triangles (or actually voxels) and you can push, pull, cut, slice, stamp, split, et cetera. Nevertheless, 3D modelling is an art of its own. But, unlike my friend thought, 3D modelling is not a matter of coding thousands of lines that define a model. Thank God (though there is the exception of insane programmers who make "64k programs" that actually do everything code-wise, but I'll spare you the details).
We
didn't ditch Paint (or probably Photoshop or Paint Shop by then) though. A 3D wireframe model doesn't have a texture yet. To give our 3D Mario block a yellow colour and a question-mark logo, we still need to put a 2D image on our 3D object. But how? In technical terms: "UV mapping". To put it simply: it's like wrapping (2D) paper around a box, putting a decal-sticker on a car, or tattooing "I miss you Mom" on your curvy arm. UV mapping is the process of letting each vertex know where to grab its colours from a 2D image.
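In data terms, that "letting each vertex know" simply means storing a 2D coordinate next to each 3D position. A minimal sketch (the struct layout is my own assumption, every engine and file format does this slightly differently):

```cpp
// Each vertex carries a 3D position plus a (u,v) coordinate into the 2D image.
// The GPU interpolates the (u,v) values across each triangle, and the shader
// uses them to grab the right pixel from the texture.
struct Vertex {
    float x, y, z;   // position in 3D space
    float u, v;      // where to grab from the 2D texture (0..1 range)
};

// One face of our question-mark block: two triangles (6 vertices) that stretch
// the full image (u,v from 0,0 to 1,1) over a 1x1 quad.
const Vertex blockFace[6] = {
    { 0, 0, 0,  0, 0 },  { 1, 0, 0,  1, 0 },  { 1, 1, 0,  1, 1 },   // triangle 1
    { 0, 0, 0,  0, 0 },  { 1, 1, 0,  1, 1 },  { 0, 1, 0,  0, 1 },   // triangle 2
};
```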
3D techniques – Voxels
So far
we covered the art part: feeding a 3D engine with 3D models (a file with a huge array of coordinates) and 2D images we can "wrap" around them. But what about the technical, programming part? How do we draw that box on the screen?
Again,
we can split paths here. Voxel engines, raytracing and rasterizing are the roads to Rome. The paved roads at least. I'll be brief about the first one. Voxelizing means we make the world out of tiny square… ehm… voxels. They are like small square patches; if you render enough of them together, they can form a volumetric shape. Like a cloud. Or this terrain in the 1998 "Delta Force" game series:
The terrain makes me think about corn-flakes, though this "furry" look had a nice side-effect when it comes to grass simulation (something quite impossible with traditional techniques on a larger scale back then).
Although
I think it's technically not a voxel-based engine, Minecraft also kinda reminds me of it: volumetric (3D) shapes getting simplified into squares or cubes. Obviously, the more voxels we use, the more natural the shapes get. The only downside is… we need freaking millions of them to avoid that "furry carpet" look. Though voxels are making a re-entrance for special (background) techniques, they never really became a common standard.
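To give a tiny feel for the idea, here is a hedged sketch of a voxel grid: the world as a 3D raster of filled/empty cells, in this case a "voxelized" sphere. The finer the grid, the less blocky (or furry) the result looks. Purely illustrative, not engine code.

```cpp
#include <vector>
#include <cstdint>

struct VoxelGrid {
    int size;                                   // grid of size x size x size voxels
    std::vector<uint8_t> filled;                // 1 = solid, 0 = empty

    explicit VoxelGrid(int n) : size(n), filled(n * n * n, 0) {}
    uint8_t& at(int x, int y, int z) { return filled[(z * size + y) * size + x]; }
};

VoxelGrid voxelizeSphere(int gridSize, float radius)
{
    VoxelGrid grid(gridSize);
    float c = gridSize * 0.5f;                  // sphere centre in the middle of the grid
    for (int z = 0; z < gridSize; ++z)
        for (int y = 0; y < gridSize; ++y)
            for (int x = 0; x < gridSize; ++x)
            {
                float dx = x - c, dy = y - c, dz = z - c;
                if (dx * dx + dy * dy + dz * dz <= radius * radius)
                    grid.at(x, y, z) = 1;       // this voxel lies "inside" the shape
            }
    return grid;
}
```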
3D techniques – Raytracing / Photon Mapping
Raytracing,
or variants like photon mapping, are semi-photorealistic approaches. They follow the rules of light-physics, as Fresnel, Young, Einstein, Fraunhofer or God intended them to be. You see shit because light photons bounce off shit and happen to reach your lucky eye. The reason shit looks like shit is its material structure. Slimy, brownish, smudgy… well, anyway. Light photons launched by the sun, or by artificial sources like a lightbulb, bounce their way into your eye (and don't worry, they don't actually carry shit molecules).
A lot of
physical phenomena happen during this exciting journey. Places that are hard to reach because of an obstacle will appear "in shade", as fewer photons reach them. Though photons often still manage to reach the place indirectly after a few bounces (and this is a very important aspect of realistic graphics, by the way). Every time a photon bounces, it either reflects or refracts (think about water or glass), plus it loses some energy. Stuff appears coloured because certain regions of the colour spectrum are lost. A red wall reflects the red portion of the photon but absorbs the other colours. White reflects "everything" (or at least all colours in equal portions), black absorbs all or most of the energy. Dark = little energy bounced.
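The colour/energy part of that story fits in a few lines of code. A hedged sketch (names and the single "absorption" number are my own simplification):

```cpp
struct Color { float r, g, b; };

// One bounce: the surface colour filters the photon's energy per channel
// (a red wall keeps the red portion and absorbs the rest), and some overall
// energy is lost as well.
Color bounce(Color photonEnergy, Color surfaceAlbedo, float absorption)
{
    Color out = { photonEnergy.r * surfaceAlbedo.r,
                  photonEnergy.g * surfaceAlbedo.g,
                  photonEnergy.b * surfaceAlbedo.b };
    // A white wall (1,1,1) keeps almost everything, a black wall (0,0,0) kills it.
    out.r *= (1.0f - absorption);
    out.g *= (1.0f - absorption);
    out.b *= (1.0f - absorption);
    return out;
}
```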
Well, I
didn't pay much attention during physics classes so I'm a bad teacher, but just remember that raytracing tries to simulate this process as accurately as possible. There is only one little problem though… A real-life situation has an (almost) infinite number of photons bouncing around. Since graphics are a continuous process (we want to redraw the screen 30 or more times per second), that would mean simulating billions of photons EACH cycle. Impossible. Not only are the numbers too big, the actual math (and mainly testing if and where a photon collided with your world) is absolutely dazzling as well. If the real world were rendered by a computer, it would take one ultra-giga-mega-Godlike PC! We're not even a little bit close.
BUT!
Like magicians, we graphics-programmers are masters of fooling you with cheap hacks and other fakery. Frauds! That's what we are. Raytracing doesn't actually launch billions of photons. We do a reverse process: for each pixel on the screen (a resolution of 800 x 600 gives us 480,000 pixels to do), we try to figure out where its light came from. Hence the name ray*tracing*. Still a big number (and actually still too slow to do in real-time with complex worlds), but a lot more manageable than billions. Though it's incomplete… By tracing a ray, we know which object bounced it towards us. But where did the light come from before that? We have to travel further, to a potential lightsource… or multiple ones. And don't forget that yet another obstacle might sit between that object and a lightsource, giving indirect light. You see, it quickly branches into millions and billions of possible paths. And all of that just to render shit.
Shit.
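The outer loop of that reverse process is simple enough to sketch. Everything here ("Scene", "Ray", "trace", the improvised camera maths) is a hypothetical illustration; a real tracer would continue each hit towards the lightsources and extra bounces, which is exactly where it gets expensive.

```cpp
struct Vec3 { float x, y, z; };
struct Ray  { Vec3 origin, direction; };

struct Scene {
    // Returns the colour seen along this ray: find the nearest hit object,
    // then keep travelling towards lightsources / bounced paths.
    Vec3 trace(const Ray& ray, int bouncesLeft) const;   // assumed to exist elsewhere
};

void renderImage(const Scene& scene, Vec3* pixels, int width, int height)
{
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
        {
            // Build a ray from the eye through this pixel (proper camera maths omitted).
            Ray ray;
            ray.origin    = { 0.0f, 0.0f, 0.0f };
            ray.direction = { (x + 0.5f) / width  - 0.5f,
                              (y + 0.5f) / height - 0.5f,
                              -1.0f };                    // looking down the -Z axis

            // 800 x 600 already means 480,000 of these traces... every frame.
            pixels[y * width + x] = scene.trace(ray, /*bouncesLeft=*/3);
        }
}
```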
Well,
there you have the reason why games don't use raytracing or photon mapping. And I was about to put "(yet)", but it's not even a "yet". We're underpowered. It might be there one day, but currently we have much smarter fake tricks that can do almost the same (though I must say some engines actually use raytracing for very specific cases, to support special techniques: hybrids).
But it
might be useful to mention how (older?) 3D movies were rendered. If you remember game-cinematics like those pretty-cool-ugly movies I mentioned in my previous "Red Alert" review, you may have noticed the "gritty-spray" look. Now first of all, movies are different from games, as they are NOT real-time. Games have to refresh their graphics 30 or more times per second to stay fluent. Movies also have a high framerate, but we can render their frames "offline". It doesn't matter if it takes 1 second, 1 hour, or 1 week to draw a single frame. If you have two production years, you have plenty of rendering time. And of course, studios like Pixar have what they call "render farms": many computers, each doing a single frame or even just a small portion of a single frame. All those separate image results are put together in the end, just like in the old days when handmade drawings of Bambi were put in line.
Toy Story must have been one of the first (if not the first) successful, fully computer-animated movies.
So that
allows us to sit back, relax, and actually launch a billion photons. Well… sort of. Of course Westwood didn't have years and thousands of computers for their Red Alert movies, nor were the computers any good back then. So, reduce "billions" to "millions" or something. It's never enough really, but the more photons we can launch, the better the results. Due to limitations or time constraints, especially older (game) movies appear "under-sampled", giving that gritty-pixel-noisy-spray look. What you see there is simply not enough photons being fired: surface pixels missed important rays, and blur-like filters are used afterwards to remove some of the noise.
3D techniques – Radiosity & LightMaps & Baking
A less
accurate, but actually much faster and (nowadays) maybe even nicer technique when taking the time/quality ratio into account, is baking radiosity lightmaps. Sounds like something North Korea would do in a reactor, but what we actually refer to is putting our camera on a small piece of surface (say a patch of brick wall) and rendering the surrounding world from its perspective. Everything it can "see" is also the light it receives. If we do that for "all" patches in our world, and repeat the whole process multiple times while accumulating previous results, we achieve indirect light.
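A very rough sketch of that repeat-and-accumulate idea is shown below. The "visibleFraction" function hides all the real work (rendering the world from the patch's point of view and weighing what it sees); the whole thing is an illustrative assumption, not how Engine22's baker is actually written.

```cpp
#include <vector>

struct Patch {
    float emitted;     // light the patch produces itself (lamps, sky)
    float received;    // light gathered from the rest of the world so far
    float reflectance; // how much of the received light it re-emits
};

// How much does patch "from" contribute to patch "to"? (visibility / form factor)
float visibleFraction(const std::vector<Patch>& world, int from, int to);  // assumed elsewhere

void bakeRadiosity(std::vector<Patch>& world, int passes)
{
    for (int pass = 0; pass < passes; ++pass)
    {
        std::vector<float> gathered(world.size(), 0.0f);

        // Every patch "looks around" and gathers light from everything it can see.
        for (std::size_t to = 0; to < world.size(); ++to)
            for (std::size_t from = 0; from < world.size(); ++from)
                if (from != to)
                    gathered[to] += visibleFraction(world, (int)from, (int)to) *
                                    (world[from].emitted +
                                     world[from].received * world[from].reflectance);

        // Store the result; the next pass bounces this light on again (indirect light).
        for (std::size_t i = 0; i < world.size(); ++i)
            world[i].received = gathered[i];
    }
}
```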
But
again, it's expensive. Not as expensive as photon mapping or raytracing maybe, but too expensive for real-time games nevertheless. To avoid long initial processing times, we just store our results in good old 2D images and "wrap" them onto our 3D geometry later on. Which is why we call these techniques "pre-baked". An offline tool, typically a map editor, has a bake-button that does this for you. This is what Engine22 offers as well, by the way.
The only
problem is that these pre-baked maps can't be changed afterwards (during the game). So it only works for static environments: walls, floors and furniture that can't move or break. And with static lightsources that don't move or switch on/off (though we have tricks for that).
3D techniques - Rasterizing
Now this
is where I initially wanted to be with this blog post. But as usual, it took me 4 pages to finally get there. Sorry. What most 3D games did, and still do, is "rasterizing". And we have some graphical APIs for that: libraries that do the hard work and utilize special graphics hardware (nVidia, AMD, …). Even if you never programmed, you probably heard of DirectX or OpenGL. Well, these are such APIs. Though DirectX does some other game-things as well, the spearhead of both APIs is providing graphics functions we can use to:
· Load 3D resources (turn model files into triangle buffers)
· Load texture resources (2D images for example)
· Load shaders (tiny C-like programs run by the videocard, mainly to calculate vertex positions and pixel colours)
· Manage those resources
· Tell the videocard what to render (which buffers, with which shaders, textures & other shader parameters)
· Enable / disable / set drawing parameters
· Draw onto the screen or into a background buffer
· Rasterize
Though
big boys, these graphical APIs are actually pretty basic. They do not make shadows or beautiful water reflections for you. They do not calculate whether a 3D object collides with a wall. You still have to do a lot yourself. But at least we have guidance now, and we utilize 3D acceleration through hardware (MUCH faster).
If we
want to draw our 3D cube, we'll have to load its vertices into a buffer, activate that buffer together with a shader and a texture, and tell the API to render it. Or something like that. Drawing usually means that we first load & transfer raw data (arrays of colours or coordinates) to the videocard. After that, we can activate these buffers and issue a render-command. Finally, the videocard does the "rasterizing".
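In (modern) OpenGL, those two steps could look roughly like this. It's a hedged sketch: I'm assuming GLEW for function loading, the cube's vertex array ("cubeVertices") and an already-compiled "shaderProgram" exist elsewhere, and all error checking is left out.

```cpp
#include <GL/glew.h>

// Step 1: transfer the raw coordinates to the videocard, once.
GLuint uploadCube(const float* cubeVertices, int vertexCount)
{
    GLuint vao, vbo;
    glGenVertexArrays(1, &vao);            // describes the vertex layout
    glGenBuffers(1, &vbo);                 // the raw buffer in video memory

    glBindVertexArray(vao);
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER,          // copy the coordinates to the videocard
                 vertexCount * 3 * sizeof(float), cubeVertices, GL_STATIC_DRAW);

    glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 3 * sizeof(float), nullptr);
    glEnableVertexAttribArray(0);          // attribute 0 = the vertex position
    return vao;
}

// Step 2: each frame, activate the buffers and issue a render-command.
void drawCube(GLuint vao, GLuint shaderProgram, int vertexCount)
{
    glUseProgram(shaderProgram);                 // activate our vertex + fragment shaders
    glBindVertexArray(vao);                      // activate the cube's buffers
    glDrawArrays(GL_TRIANGLES, 0, vertexCount);  // let the videocard rasterize
}
```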
In the
case of 3D graphics, rasterizing means converting those triangles to pixels. A vertex shader calculates where exactly to put those pixels/dots on the screen. That usually depends on a "camera" we define elsewhere, as a set of matrices. These matrices tell the camera position, the viewing direction, how far it can look, the viewing angle, et cetera. The cube itself also has a matrix that tells its position, rotation and eventually its scale. How (and if) the cube appears is a calculation using those matrices. If the camera is looking the other way, the cube won't be on the screen at all. If the distance is very far, the cube appears small. And so on. Doing these calculations sounds very complex, and yeah, matrix calculations are pretty scary. But luckily the internet has dozens of examples, and the videocard & render API will guide you. And if you use an engine like Engine22, it will do these parts for you most of the time.
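To show how little code those scary matrices take, here is a minimal sketch using the GLM math library (my choice for the example, not necessarily what any particular engine uses): the cube's own matrix, the camera's view and projection matrices, and their combination that the vertex shader multiplies with every vertex.

```cpp
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

glm::mat4 buildMVP(const glm::vec3& cubePosition, float cubeAngle, float cubeScale)
{
    // The cube's matrix: its position, rotation and scale.
    glm::mat4 model = glm::translate(glm::mat4(1.0f), cubePosition);
    model = glm::rotate(model, cubeAngle, glm::vec3(0.0f, 1.0f, 0.0f));
    model = glm::scale(model, glm::vec3(cubeScale));

    // The camera: where it stands, where it looks, viewing angle, near/far range.
    glm::mat4 view = glm::lookAt(glm::vec3(0.0f, 2.0f, 5.0f),   // camera position
                                 glm::vec3(0.0f, 0.0f, 0.0f),   // looking at the origin
                                 glm::vec3(0.0f, 1.0f, 0.0f));  // "up"
    glm::mat4 projection = glm::perspective(glm::radians(60.0f), // field of view
                                            16.0f / 9.0f,        // aspect ratio
                                            0.1f, 100.0f);       // near / far plane

    // The vertex shader would then do: gl_Position = MVP * vec4(vertexPosition, 1.0);
    return projection * view * model;
}
```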
During
the rasterization process (think about an old matrix printer plotting dots on paper) we also have to "inject" colours. Fragment or pixel shaders are used for that nowadays: a small program that does the math. It could be as simple as colouring all pixels red, but more commonly we use textures (the "wraps", remember?), and eventually lightsources or pre-baked buffers as explained in the previous part. This is also the stage where we perform tricks like "bumpmapping".
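For a taste of what such a pixel program looks like, here is a tiny GLSL fragment shader, stored as a C++ string so it could be handed to the API at runtime. The uniform names ("diffuseMap", "lightMap") are hypothetical; it simply multiplies the texture colour with a pre-baked lightmap.

```cpp
// A minimal fragment ("pixel") shader sketch. The GLSL source sits in a C++
// string, ready to be compiled via glCreateShader / glShaderSource / glCompileShader.
const char* fragmentShaderSource = R"GLSL(
#version 330 core
in vec2 uv;                      // interpolated UV coordinate from the vertex shader
uniform sampler2D diffuseMap;    // the "wrap" texture
uniform sampler2D lightMap;      // pre-baked lighting (see the radiosity part)
out vec4 fragColor;

void main()
{
    vec3 albedo = texture(diffuseMap, uv).rgb;  // fetch the surface colour
    vec3 light  = texture(lightMap,  uv).rgb;   // fetch the baked light
    fragColor   = vec4(albedo * light, 1.0);    // "inject" the final pixel colour
}
)GLSL";
```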
Note
that these "shaders" weren't there 15 years ago. The principles were more or less the same, but these parts of the drawing "pipeline" were fixed functions. Instead of having to program your own shader code, you just told OpenGL or DirectX to use a texture or not, or to use lightsource X yes/no. Yep, that was a lot simpler. But also a lot more restricted (and uglier). Anyhow, if you're an older programmer from the 2000 era, just keep in mind that shaders took over the place. It's the major difference between early-2000 and current graphics techniques. Other than that… same old story, more or less.
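For the older programmers: that fixed-function style looked roughly like this in legacy OpenGL, just switches and parameters instead of shader code ("wallTexture" is an assumed, already-loaded texture handle).

```cpp
#include <GL/gl.h>

void setupFixedFunction(GLuint wallTexture)
{
    glEnable(GL_TEXTURE_2D);                    // "use a texture: yes"
    glBindTexture(GL_TEXTURE_2D, wallTexture);  // which texture

    glEnable(GL_LIGHTING);                      // "use lighting: yes"
    glEnable(GL_LIGHT0);                        // "use lightsource 0: yes"
    const GLfloat lightPos[4] = { 0.0f, 10.0f, 0.0f, 1.0f };
    glLightfv(GL_LIGHT0, GL_POSITION, lightPos);// where that lightsource sits
}
```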
Shots from the new Engine22 Map Editor. Everything you'll see is rasterized & using shaders.
So yeah,
with (fragment) shaders my old friend maybe was a little bit right after all: drawing the scene pixel-by-pixel. Either way, it's quite different from more natural (realistic) approaches like photon mapping. We rasterize an object, say a cube, monster or wall. We plot the geometric shape on the screen (eventually culling it if something is in front!), but we don't have knowledge about its surroundings. We can't let our pixel-shader check the surroundings to determine what to reflect, what casts shadows, or which lightsources directly or indirectly piss their photons on it. This is done with additional background steps that store environmental information into (texture) buffers we can query later on in those shaders. For example, such a buffer could tell us what a lightsource affects, or how the world looks captured at a single point, so we can use it for reflections.
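Querying such a pre-captured buffer in a shader is surprisingly short. A hedged sketch (again GLSL in a C++ string; "envProbe" is a hypothetical cube-map uniform that a background step would have filled with the surroundings):

```cpp
const char* reflectionShaderSource = R"GLSL(
#version 330 core
in vec3 worldNormal;
in vec3 worldPos;
uniform vec3 cameraPos;
uniform samplerCube envProbe;   // the world, captured at a single point earlier on
out vec4 fragColor;

void main()
{
    vec3 viewDir    = normalize(worldPos - cameraPos);          // eye towards the surface
    vec3 reflectDir = reflect(viewDir, normalize(worldNormal));  // mirror direction
    fragColor       = texture(envProbe, reflectDir);             // look up the "surroundings"
}
)GLSL";
```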
It's
complex stuff, and moreover, it's fake stuff. Whether it's shadows, reflective orbs or the way light finds its way under that table; it's all fake, simplified, approximated, guessed or simulated. But so damn smart and good that a gamer can hardly tell :) Though game-engines like Unreal or Engine22 do a lot more than just graphics (think about audio, physics, scripting, AI, …), their selling point and major strength is usually that magic box of tricks. And as videocards keep getting faster and faster, Pandora's box is getting more powerful as well. But remember kids! It's not physically correct. Fresnel would punch me three black eyes.