Monday, December 26, 2011

Take 48, cut

Next post will contain a movie release. Unless I get heart failure while stuffing myself with Christmas food.

For those who expect another gameplay / spooky horror trailer, I'll have to disappoint you a bit. It's a "Tech-Demo". Which means it shows graphical stuff mainly. The goal is not to please gamers. Screw you, the whole purpose is to attract more 2D / 3D talent :p Ok, ok, we added a little present that might please you as well ;) Nah, the movie could represent a Tower22 part (although T22 does not take place in military looking Radar Stations). It's a little taste of how things *could* look (and even better)... IF we get a team strong, talented and motivated enough to help me through that is.

One problem to tackle with this next movie is the recording quality. Main issues last time were the hellish sounds and overdone blur. Both caused by the combination of Tower22 and FRAPS fighting over the same CPU resources. Two mangy dogs fighting for a stinky shoe. Two old ladies arguing about the TV channel. Two hoes pulling each other’s (pink) hair for a pimp. Two... Anyway, a year passed by, plenty of time to evaluate and prepare better for a next time. So what did we do? ...Ehr...

Doing homework was never my strongest skill. In contrast to Postponing. So I started sweating again for the recording session. The desktop computer used for recording has a newer video-card, so the good news is that it should have some more firepower this time. Then again, this new demo is quite a lot more advanced than Demo1. More lights, more shading tricks, more textures and objects, et cetera. On top, I didn't try the engine for half a year on that computer. Oh boy.



As expected, the framerate crippled again, but not as badly as the previous time. After some optimizations, FRAPS and T22 eventually got in Sweet Harmony. Sort of, at 20 frames per second. That's still not smooth as silk, but already better than the previous time, plus the resolution has doubled as well. Maybe we can squeeze out a few more frames.

A far better approach would be recording the video-card (HDMI?) output with an external device. Like you could record your Mario Paint (SNES) movies on VHS. Hmmm, maybe I should buy a digital camcorder or something. Let it store the video-card output on a disk, then read the digital file back on the computer. Never tried it, but it sounds like a valid plan. And otherwise you'll have to make do with a jerkier movie :p

Still have to pimp a few rooms. Julio takes a snapshot of an existing room, adds some photosoup magic, then passes it back so I can try to achieve the same thing with the engine.


One thing I could solve software-o-matic, was the camera path. Like in real movies, it sometimes takes several cuts to record a scene. Or did you think Commando was recorded in one single shot; Carry tree, smile horrible while daughter puts ice on nose, smell enemies, get down, kill enemies, daughter gone, get captured, kill guy and say he's dead tired, jump out of plane, go to shopping mall, throw telephone cell, drive, chase, crash, throw little guy from a cliff, fuck-you-asshole, infiltrate hangar, window shopper, get busted, get freed, fly to a tropical island, undress, boat, dress & makeup, kill a few with tactical stealth tricks, kill 10 dozens more being absolutely not stealthy, kill with an Uzi, kill with a shotgun, shed fight, get in the villa, shoot gringo warlord, go to the basement, get shot in the shoulder, knife-fight with a fat moustache man in a chain mail (hu?), blow some steam Bennett, found daughter, happy, -fin-. Bang, 90 minutes.

Arnold Schwarzenegger has awesome acting skills, but not so awesome that he could shoot this movie in a single 90-minute breath. Hey, not even fucking Commando himself can do that, cause the timeline was spread over 12 hours in the movie. Enough kidding. Of course movies take multiple shots, with multiple cameras. And afterwards, the best parts are pasted together.

With games that use a first-person-camera, this isn't so easy. If the camera suddenly shifts 30 centimeters to the left, or if the lighting suddenly goes a bit darker, the viewer will notice this "discontinuity". Do it too often, and you'll get an epileptic attack. Movies tend to switch the camera viewpoint every 5 seconds or so. This "switching" helps hide discontinuities. Maybe the actor moved 10 cm to the left, maybe the sun stopped shining, maybe he lost his boner a bit (other type of movie). Or maybe a plane just happened to fly over the Coliseum, roughly 2000 years ago. Neither did I know a camera crew was taking part in the battle in Ramelle. The point is, the viewer doesn't notice because (s)he's focused on the actors, and the camera switching hides faults too. By the way, did you know T22 Demo1 was made out of 2 parts as well? When the player runs up the stairs, a second movie was used at some point. Yep, when the camera switched from third to first person view.


This time I won't be switching camera views, so everything has to be recorded perfectly in one take. And that's difficult! You have to make sure the camera captures all the wanted "hotspots". Not too long, but not too short either. We amateur camera-men tend to record things boringly long, or in turbo-pace. If you know what you are seeing, focusing on it for 2 seconds seems like an eternity. But for another viewer that is not familiar with the scene, it takes at least a second to re-focus. I always get dizzy and sick of those MTV shows where the camera crew seems to be on speed. Yo biatch, my kitchen, home cinema for real, here’s where it all happens in the master bedroom, 50 inch rims, now get off my property. 10 billion shots, they sure show a whole lot. I just have no idea what I saw the last 10 seconds. Too fast. Anyway, finding the right pace for a movie shot is very important. And last but not least, the camera has to follow its path in a somewhat natural, curvy way.

In my case, I have to navigate the camera for ~6 minutes (that's how long the movie takes) without making faults. With a jerky mouse. With FRAPS and T22 arguing for (CPU) resources. With a little kid making noise in the background. With a girl that starts talking about, ...-hell I don't know- while you are sweating and recording the whole thing. You can't do it in 2 or more parts, cause it's just impossible to redo the exact same camera path. So at the stitching points, you suddenly see the camera hopping to another point, the lighting changes, et cetera.

Moving the camera down a stair, with a curve, is difficult enough already. Doing it twice exactly the same is impossible.

Do the loco-motion logger
Hey... But I got a little idea! Nothing new really, yet very useful. What if you lock your family outside, don't run FRAPS, and run a "motion logger"? Yeah, just store the camera view every (few) cycle(s). In my case, I made a little class that stored the camera matrix 15 times per second. Just directly writing it into an open file stream. If you are really cool, you make a background thread that fills 2 buffers and writes those to the file stream, but that wasn't necessary for the little amount of data in my case. To give you an idea about file-sizes,
---- 5 minutes = 5x60 x 15(fps) x matrix(64 bytes) = 288,000 bytes = ~281 kilobytes
That isn't too bad. Otherwise you may store quaternions (plus a position) instead of matrices to save more space.
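Here is a minimal sketch of such a motion logger. It is not the actual Engine22 class (that one isn't shown here); the names `MotionLogger` and `update`, and the choice of 16 little-endian 32-bit floats per matrix, are assumptions for illustration.

```python
import io
import struct

MATRIX_FORMAT = "<16f"                         # assumed layout: 4x4 matrix of 32-bit floats
MATRIX_SIZE = struct.calcsize(MATRIX_FORMAT)   # 64 bytes per sample

class MotionLogger:
    """Writes the camera matrix to a binary stream at a fixed sample rate."""

    def __init__(self, stream, samples_per_second=15):
        self.stream = stream
        self.interval = 1.0 / samples_per_second
        self.next_sample_time = 0.0   # first sample is taken at t = 0
        self.count = 0

    def update(self, now, camera_matrix):
        """Call every engine cycle. `now` is seconds since recording started,
        `camera_matrix` is a sequence of 16 floats."""
        # A slow cycle may be due more than one sample; the same matrix is
        # then written repeatedly so the samples stay evenly spaced in time.
        while now >= self.next_sample_time:
            self.stream.write(struct.pack(MATRIX_FORMAT, *camera_matrix))
            self.next_sample_time += self.interval
            self.count += 1

# Usage: an in-memory stream stands in for the opened file here.
log = io.BytesIO()
logger = MotionLogger(log)
identity = [1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1]
for t in (0.0, 0.05, 0.1):
    logger.update(t, identity)
```

With a position plus a quaternion you would store 7 floats (28 bytes) per sample instead of 16, which brings five minutes down to 5x60 x 15 x 28 = 126,000 bytes.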

Now that we have the camera path stored, we can redo the navigation with a click on the button. Store all matrices in a gigantic array or better, make a (background) stream reader that fills a buffer with matrices every X seconds. Then calculate the actual camera matrix by interpolating between the current and next stored matrix. The path is 99% identical to the one you recorded earlier. Optionally you can now relax and edit parts, or let the computer calculate a spline to get a smoother route.
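The playback side could look roughly like this. Again a sketch under assumptions: the function names are made up, the whole log is loaded into one list (the "gigantic array" variant), and the blend is a plain element-wise linear interpolation. Blending rotation matrices that way is not mathematically exact, but with 15 samples per second the neighbouring matrices are close enough that it should be hard to notice.

```python
import struct

MATRIX_FORMAT = "<16f"                         # assumed: 16 floats per stored matrix
MATRIX_SIZE = struct.calcsize(MATRIX_FORMAT)   # 64 bytes

def load_path(data):
    """Unpack a bytes object of consecutive 64-byte matrices into a list of tuples."""
    return [struct.unpack_from(MATRIX_FORMAT, data, offset)
            for offset in range(0, len(data) - MATRIX_SIZE + 1, MATRIX_SIZE)]

def camera_at(path, t, samples_per_second=15):
    """Camera matrix at playback time t (seconds), blended between two samples."""
    position = max(t, 0.0) * samples_per_second
    index = min(int(position), len(path) - 1)
    nxt = min(index + 1, len(path) - 1)
    # Past the last sample there is nothing to blend towards.
    blend = position - index if nxt != index else 0.0
    current, following = path[index], path[nxt]
    return tuple(a + (b - a) * blend for a, b in zip(current, following))

# Usage: two fake "matrices" (all 0.0 and all 2.0) make the blending easy to see.
data = struct.pack(MATRIX_FORMAT, *[0.0] * 16) + struct.pack(MATRIX_FORMAT, *[2.0] * 16)
path = load_path(data)
halfway = camera_at(path, 1.0 / 30.0)   # halfway between sample 0 and sample 1
```

Because the camera matrix only depends on the playback time, it does not matter how jerky the framerate is while recording the video.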

In case your girl started talking about spaghetti or your little brother kicked your legs while recording, causing your mouse-hand to make a jerky movement, it's relatively easy to polish. Let the camera follow the path till the point where things went wrong. Stop it, hit the capture button again, and now do a second attempt. The newer part should fit seamlessly with the previous one.
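In data terms, redoing a part just means cutting the stored path at the bad spot and gluing the new take behind it. A tiny sketch, assuming the path is a list of samples at 15 per second; `splice_take` is a hypothetical helper name, not something from the engine.

```python
def splice_take(path, cut_time, new_take, samples_per_second=15):
    """Keep the samples recorded before cut_time (seconds) and append the new take."""
    keep = int(cut_time * samples_per_second)
    return path[:keep] + list(new_take)

# Usage: numbers stand in for camera matrices; the redo starts at 2 seconds.
old_path = list(range(100))
new_path = splice_take(old_path, 2.0, ["redo-0", "redo-1"])
```

The joint only stays invisible if the second attempt starts from the camera position at the cut, which is exactly what you get by letting the camera follow the old path up to that point first.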


When you're finally happy with the camera route, you can record the movie for real. Just let the camera roll over the railtrack again, sit back, and relax. Even when the video capture program causes a jerky framerate, the camera will stay stable cause no human input is needed anymore. What a relief.

Sunday, December 18, 2011

Turbo Boosters

One visitor asked: "Dear Santa. Where do all those magical presents (effects) in your T22 world come from? And how do you prevent the sleigh from crashing down when loading more and more of them? Yours truly, Nick, 7 years old, Nebraska."


Well boys and girls. If you think the Tower22 engine ("Engine22") is fast as hell, ho-ho-ho, no. To the facts; currently the "Radar Station" maps run at a pounding ~17 frames per second on my laptop. Of course that depends a bit on the view. When staring at an empty corner, the speed may be an acceptable 28 fps. When looking at multiple rooms that also contain a dense cloud of lit particles, the laptop switches over to Burger-King-gravy speed: ~12 or even less.

The laptop is getting old though. 32 bits, and moreover, "just" a GeForce 9800M GTS. A game like Crysis Warhead doesn't run smoothly either, though faster than this. My other desktop computer gets far better results. It's also a 32-bit dual-core relic from the past, but with a better video card (EVGA GeForce 4700 GTS). The first Demo movie for example ran at about 25 frames per second on the laptop (the dark empty corridors faster though). And close to 60 fps on the desktop. Another comparison. The "Radar Station" we're working on now (almost finished!) started at 25 fps on the laptop. And now after adding more textures, more effects, more lamps and more objects, the framerate dropped to the ~17 mentioned earlier. No idea how it will behave on the desktop, but I guess it will be "acceptable". Well it better damn be! Otherwise we'll get that annoying motion-blur again due to the relatively large cycle intervals ;)

Compare this to this. At least the framerate dropped for a good cause :)


For gaming, such a low framerate sucks ass of course. Ever played Doom1 on a 386? You wanted to play the game so badly so you kept pushing, but the PC really said “please no! Stop it!”. For development though, a lower speed is "ok". I'm not playing, just flying through the maps to check coding adjustments or to decorate the maps. That's also the reason why I barely touch the much faster desktop computer. With a laptop, you can at least code in bed, in bath, on the toilet, while being kidnapped by al Qaida, or wherever you are. So, I prefer mobility over speed in this case.

I also have to note that "dropping performance" happens in a gradual way. It's like getting fat; you don't gain 10kg after a week of snacking (unless you drank 6 kegs of beer in a mug like this -blurp- ). It's not that the framerate suddenly cripples when adding effect-X. Nah, usually it just slows a tiny little bit. You barely notice it, so, nothing to worry about. But all those tiny bits together...
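Those tiny bits are easier to add up in milliseconds per frame than in fps, because costs stack in time, not in frames. A small sketch using the laptop numbers mentioned earlier (25 fps at the start, ~17 fps now); the helper names are just for illustration.

```python
def frame_time_ms(fps):
    """Milliseconds spent on one frame at the given framerate."""
    return 1000.0 / fps

def added_cost_ms(fps_before, fps_after):
    """Extra milliseconds per frame after the framerate dropped."""
    return frame_time_ms(fps_after) - frame_time_ms(fps_before)

before = frame_time_ms(25)       # 40 ms per frame
extra = added_cost_ms(25, 17)    # roughly 18.8 ms added by everything since
print(f"{before:.1f} ms -> {before + extra:.1f} ms (+{extra:.1f} ms)")
```

So a couple dozen additions of less than a millisecond each, none of them noticeable on their own, are enough to explain the whole drop.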



As said, the performance cannot be compared to commercial engines. Simply, I have too little time and too little experience to make it lightning fast. Then again, I don't worry that much about it either. It will still take years to finish this game, so performance tweaking right now is a bit of a waste of time. 2 years later there will be faster hardware, plus some of the techniques we're using might be replaced by then. Especially things like (ambient) lighting. And the parallax effect shown earlier also keeps getting smarter, faster algorithms. So, I'll just take a look at some of those techniques again in a few years from now. Well, in fact, 1 or 2 expert graphics programmers can help by then. Optimizing shaders, giving professional advice, finding bottlenecks, et cetera. But at this moment, maximum performance is certainly not a priority.

No, the damage on those tiles wasn't there first. It's just a flat decal, faking depth with Parallax Occlusion Mapping...

Nevertheless, that does not mean we're just blindly adding features. First of all, an effect needs to be worth it. There are dozens of papers that show awesome stuff. The next best things since sliced bread... ahum, in theory. But more often than not they come at a high cost, and/or require a very specific implementation that may restrict other things in your rendering pipeline. Take BRDFs for example. By using real-world sampled data and specific formulas, you can create a (far) more realistic look on your materials. Gold doesn't reflect light in the same way as velvet, concrete or a freakn grapefruit for that matter. Yet most games treat all the surfaces with the standard "Blinn" or "Phong" lighting model (which explains the plastic or metal look on softer surfaces such as skin or wallpaper). Why not use BRDFs then? Well:

- Hard to fit in a deferred/inferred rendering pipeline (cause the lamps have to check for each pixel what kind of material it is made of, then have access to the BRDF parameters somehow)
- Only 0.1% of the artists know how to make BRDF (image) data. And most of the programmers don't know how-to either.
- And even if they did, it requires fucking professor hardware for the BRDF acquisition. You can't just draw them by hand you know...
- As for the end-result (thus with several lamps, blurs, SSAO, ambient, reflections, post-filters and a ton of other effects applied), an untrained viewer barely notices the difference between a concrete wall rendered with Phong or Oren-Nayar.

So is it worth it... Hmmm... Eventually probably yes. But as long as we need the performance elsewhere, techniques like BRDF, or slightly sharper shadows are usually exchanged for more noticeable things such as sexy blurs or being able to put more objects in the scene. The Parallax effect is debatable too. Google for "Parallax Occlusion Mapping", and you'll see that the coolest screenshots always placed the camera right next to a brick wall or on the floor. But how many times will you really see surfaces from that close? When you get killed and fall onto the ground maybe, but otherwise the parallax effect, and certainly the subtle "self-shadowing" effect, is hard to notice. Yet it comes at a very high cost. That's why we only use it for rough surfaces with obvious height differences. It's a pretty cool effect, and hardware starts to allow it. But if you run out of rendering-power, POM would be one of the first things to ditch I'd say.

Cool, POM and shit(Parallax Occlusion Mapping). But do you still see the effect now? Same wall (below the sink), but zoomed back to a more realistic viewing distance...

Second thing about maintaining our performance. When I do add an effect, I'll try to do it in a smart way. GPUs are powerful but hate to get interrupted. It's not like a multi-tasking woman that can phone, cook, chat, watch the kids, and clean the house at the same time (at least she thinks she can). Best is to create batches and do as much stuff as possible in one call. Sort out data, try to render bigger chunks with as few "state changes" as possible. That's pretty hard these days with so many different effects being pre-processed in the background. ShadowMaps need to be prepared, animated objects such as cloth or characters may need to stream their data to a VBO or texture first, you'll need depth, velocity and all kinds of image data before you can render SSAO or motion blur, and so on. What you see in the Tower22 screenshots is only a small portion of what really happens on the GPU. Before it starts rendering screen content, it draws shadowMaps, cubeMaps, mirrors (for water reflections), and plenty of other things.
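To show what "sorting out data" buys you, here is a small sketch that counts state changes before and after sorting a draw list. It is not how Engine22 does it; `DrawItem`, the two state fields and the assumption that a shader switch is the costlier one are all made up for the example.

```python
from collections import namedtuple

# Hypothetical draw record: which shader and texture a mesh needs.
DrawItem = namedtuple("DrawItem", "shader texture mesh")

def count_state_changes(items):
    """Count how often the shader or texture differs from the previous draw."""
    changes = 0
    prev_shader = prev_texture = None
    for item in items:
        if item.shader != prev_shader:
            changes += 1
            prev_shader = item.shader
        if item.texture != prev_texture:
            changes += 1
            prev_texture = item.texture
    return changes

def sort_for_batching(items):
    """Group draws by shader first (assumed the costlier switch), then by texture."""
    return sorted(items, key=lambda item: (item.shader, item.texture))

# Usage: the same four meshes, in scene order and in sorted order.
scene = [
    DrawItem("phong", "brick", "wall_1"),
    DrawItem("skin", "face", "monster_1"),
    DrawItem("phong", "brick", "wall_2"),
    DrawItem("skin", "face", "monster_2"),
]
unsorted_changes = count_state_changes(scene)                    # 8
sorted_changes = count_state_changes(sort_for_batching(scene))   # 4
```

Meshes that end up next to each other with identical state are also the candidates for merging into one bigger draw call.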

To prevent getting slower than a tranquilized panda, these steps are combined if possible. With MRT (Multiple Render Targets) different types of data are rendered to different textures, but in the same pass. Which means I only have to call the GPU to draw the world geometry once. I also sort the steps by screen resolution so that the GPU has to do less switching between Frame Buffer targets of different resolutions. Just to name a few.

Yet another trick to speed up is saving bandwidth by using compressed textures, or keeping the geometry as much as possible on the videocard itself with VBOs (also for skinned/animated things). The latter means that the triangles and additional vertex data for, let's say a monster-model, are already present on the videocard. So we don't have to re-send all the triangles from RAM to GPU memory (via the CPU) each cycle again.



Texture with particles, rendered in the background on a lower resolution

Whatever you do, don't just do it without thinking. Often there are multiple ways to achieve the same thing. So before I just implement something, I usually ask around (on Gamedev for example) for some tips. One last example. Particles. The old fashioned way is to render a large amount of (transparent) points or billboard sprites (quads) into your world. A whole bunch of them forms a cloud, bloodspray, smoke, puke, waterfall or whatever you had in mind. But because so many quads are drawn on top of each other, the GPU easily suffers from fillrate problems. The same screen pixel gets treated dozens of times, and that hurts. One trick to reduce the amount of calculations is to draw the particles offscreen, on a smaller texture. Then in the end, draw the texture that contains all particles on top of your scene. Since the offscreen texture has a lower resolution, fewer pixel calculations have to be done. Of course, the quality (texture sharpness) also drops but for foggy/blurry stuff like fart gasses or smoke, this is actually a good thing. Two portions of happiness for the price of one.
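A rough back-of-the-envelope sketch of why the smaller texture helps. The screen size, coverage and overdraw numbers below are made-up assumptions, not measurements from Tower22, and the function only counts pixels; it ignores that the composite pass is much cheaper per pixel than a lit particle.

```python
def particle_pixel_cost(width, height, coverage, overdraw, scale=1.0):
    """Estimate shaded pixels for a particle cloud drawn at `scale` of the screen size.

    coverage: fraction of the screen the cloud covers (0..1)
    overdraw: average number of particle layers per covered pixel
    When scale < 1, one full-screen pass is added for drawing the small
    texture back on top of the scene.
    """
    buffer_pixels = int(width * scale) * int(height * scale)
    particle_cost = buffer_pixels * coverage * overdraw
    composite_cost = width * height if scale < 1.0 else 0
    return particle_cost + composite_cost

# Assumed case: 1280x720 screen, cloud covers half of it, 20 layers deep.
full_res = particle_pixel_cost(1280, 720, 0.5, 20)             # 9,216,000 pixels
half_res = particle_pixel_cost(1280, 720, 0.5, 20, scale=0.5)  # 3,225,600 pixels
```

Halving the resolution quarters the particle work, so the deeper the overdraw, the more the extra composite pass pays for itself.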


Don't know what speed the desktop computer will reach when we record the next Tech-Demo movie. But at the end of the story, I'm focused on making Tower22. Not a super engine. Let's first just build a car that drives properly, rather than a car that drives 300 miles per hour. Besides, let those lazy Silicon Valley’s bake better video-cards instead :p Nah, of course, you can't totally neglect the performance in a game engine. If that car needs to drive faster than 150 mph eventually, you shouldn't start with a Fred Flintstone chassis prototype. But squeezing out milliseconds will be future work.

Thursday, December 8, 2011

Echo

Sorry for the lack of fairy tales these last weeks... Let's say we're very, very busy finishing the demo movie! And playing Zelda ;)

A lot of stuff has been added. Shader improvements, Parallax Occlusion Mapping, objects, morphing animations, and... I lost count. Anyway, the least I can do is post a few screenshots then.


Waterpool decals