The GPU Is Not A Magic Wand
So I’m writing a game engine in C++, on top of which I’ll be building a single-player RPG. I don’t have a lot to show yet, just black triangle stuff, but I wanted to share a particular event.
I’ve just gotten to the point where I’ve got the very basics of a rendering pipeline working, and I wanted to see how many polygons I could push, and at what framerate I could push them. To do this, I wrote a simple mesh class that makes a bunch of colored triangles using random coordinates and colors, bundles them all up into a single batch, and sends that batch over to the video card. I enabled multisampling, because I don’t like looking at jagged edges, and I enabled blending, because I wanted the triangles to be translucent.
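For the curious, building a batch like that can be sketched roughly as follows. This is a minimal illustration, not the engine's actual mesh class: the `Vertex` layout, the function name `makeRandomTriangleBatch`, and the `seed` parameter are all assumptions made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// One vertex: a 2D position in pixels plus an RGBA color.
// This layout is an assumption for illustration only.
struct Vertex {
    float x, y;
    float r, g, b, a;
};

// Builds `triangleCount` triangles with random corners inside a
// width x height area and a random (possibly translucent) color per vertex.
// Every vertex goes into one contiguous array, so the whole batch can be
// handed to the video card in a single upload.
std::vector<Vertex> makeRandomTriangleBatch(std::size_t triangleCount,
                                            float width, float height,
                                            unsigned seed) {
    assert(width > 0.0f && height > 0.0f);

    std::mt19937 rng(seed);
    std::uniform_real_distribution<float> xDist(0.0f, width);
    std::uniform_real_distribution<float> yDist(0.0f, height);
    std::uniform_real_distribution<float> colorDist(0.0f, 1.0f);

    std::vector<Vertex> batch;
    batch.reserve(triangleCount * 3);  // three vertices per triangle

    for (std::size_t i = 0; i < triangleCount * 3; ++i) {
        Vertex v;
        v.x = xDist(rng);
        v.y = yDist(rng);
        v.r = colorDist(rng);
        v.g = colorDist(rng);
        v.b = colorDist(rng);
        v.a = colorDist(rng);  // alpha below 1 gives translucency when blending
        batch.push_back(v);
    }
    return batch;
}
```

Calling `makeRandomTriangleBatch(200, 1024.0f, 768.0f, 1)` would give 600 vertices in one array, which is the kind of thing you would then copy into a vertex buffer and draw with a single call.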
Then I fired it up using 200 triangles and it worked like a charm. All the data is on the video card, the CPU should basically be sitting idle, and the video card can just do its thing as fast as it can. And that’s fast, right?
Well, I wanted to see how much it could handle, right? So that’s what I did. I upped the count to 20,000 triangles (with random coordinates and colors, with multisampling set to a 4×4 grid, with blending enabled on every triangle), sent them over to my video card, and said “have fun!”
I guess the GPU had fun — it locked up the machine pretty hard, and I think it took me about five minutes until I was able to recover control.
What happened? 200 triangles worked fine. At 2,000 it was a little laggy (I discovered this after the fact). At 20,000 I couldn’t even interact with the desktop long enough to hit ‘stop’ in the debugger. Something was wrong; I should be getting far more than 1 frame per second with 20,000 triangles.
Was I accidentally using the software renderer? No, the renderer string says ‘ATI accelerated’.
Was it a problem with using the video card’s memory versus CPU memory? No, it had similarly poor performance with vertex arrays in CPU memory, and even with immediate mode.
Was it a problem with buffering or video depth? No, all that seems to be fine; changing things around had no measurable effect.
Was it the antialiasing or the blending? No, disabling those improved it only slightly.
Was it maybe a problem with the driver or video card itself? No, because tons of games play just fine, and they push way more than 20,000 polygons at once, so it’s got to be something I’m doing wrong.
So what’s the problem? I chased my tail for a while, poking at various options, and then — on a lark — I turned off rendering, so that all of the calls I made didn’t actually do anything. Boom — I went from 1 FPS to well over 1000 FPS.
That was my Houseian moment. Performance shot up when I turned off rendering because the GPU had been doing too much, and now it was doing nothing. But how was that possible? I’m only pushing 20,000 triangles, and modern games push way (way) more than that and still get decent framerates. So the problem is not that I’m using too many triangles; the problem is that each triangle I’m using makes the GPU do too much work. Most of them take up nearly half the screen, so the rasterizer has to fill half a screen’s worth of pixels nearly 20,000 times in order to draw a single frame.
Yes, the GPU is fast, and it can do a lot at once, but it’s not a magic wand. It still has to go through, pixel by pixel, reading and writing colors in the various buffers. Half of a 1024×768 screen is about 393,000 pixels; do that 20,000 times and you get nearly 8 billion pixel writes per frame, and that is before counting the extra samples from supersampling to 4096×3072. It’s still silicon under there, you know, not an extradimensional pocket of pure graphics.
(I changed the triangles so they were much smaller (about 20×20 pixels each), and ran it again. It was fast and smooth, pushing a framerate that was high enough I didn’t care to remember what it was. I increased the number of triangles up to 200,000 and it was still pretty okay, not nearly 60fps, but still not bad. Of course, this is all without any game logic, mesh deformation, event handling, etc.)
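To put rough numbers on the two runs, here is a back-of-the-envelope sketch. It assumes every big triangle covers about half of a 1024×768 screen and every small one about half of its 20×20 bounding box (so roughly 200 pixels); the function names are made up for illustration. It counts pixel writes only and ignores clipping and any shortcuts the hardware takes, so treat the results as order-of-magnitude estimates rather than measurements.

```cpp
#include <cassert>
#include <cstdint>

// Rough number of sample writes per frame, assuming every triangle covers
// `pixelsPerTriangle` pixels and every pixel holds `samplesPerPixel` samples.
std::uint64_t estimateSampleWrites(std::uint64_t triangleCount,
                                   std::uint64_t pixelsPerTriangle,
                                   std::uint64_t samplesPerPixel) {
    assert(samplesPerPixel >= 1);
    return triangleCount * pixelsPerTriangle * samplesPerPixel;
}

// First run: 20,000 triangles, each close to half of a 1024x768 screen.
std::uint64_t bigTriangleWrites() {
    const std::uint64_t halfScreen = (1024ull * 768ull) / 2;  // 393,216 pixels
    return estimateSampleWrites(20000, halfScreen, 1);
}

// Second run: 200,000 triangles of about 20x20 pixels each.
// A triangle fills roughly half of its bounding box: about 200 pixels.
std::uint64_t smallTriangleWrites() {
    const std::uint64_t pixelsEach = (20ull * 20ull) / 2;
    return estimateSampleWrites(200000, pixelsEach, 1);
}
```

Under those assumptions the first run comes to 7,864,320,000 pixel writes per frame and the second to 40,000,000, so ten times as many triangles still cost nearly 200 times less fill work. Passing 16 as `samplesPerPixel` for the 4×4 grid pushes the first figure past 125 billion, though that is an upper bound, since multisampling hardware does not necessarily do full work for every sample.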