Herein we describe an optimization technique similar to putting a processing loop into a background thread, but with (arguably) better efficiency, reliability, and giving control directly to the owning thread (rather than indirectly via operating system prioirity calls.)
@p
Ahem, got that?
Let's define a simple method, it works on a 2D image, called process_image (image* I):
void process_image (image* I) {
@p for (int y = 0; y < I->h; ++y) {
@p for (int x = 0; x < I->w; ++x) {
I->set_pixel (x, y, raytrace (x, y));
}
@p }
@p }
Here the "raytrace" method isn't important-- it's just something complex that we need to do.
Next, we imagine this little routine running inside your main game loop.
@p
For instance, update () calls sprite::update () for something, which then checks if it has an image, and if not it calls process_image (I) on it's own bitmap in order to raytrace itself (nevermind the specifics of whether or not this is sensible.)
@p
This only happens once, let's say, when a sprite first is loaded, or once in awhile, say when it changes it's animation.
@p
But there is a problem-- calls to process_image () take way too long!
@p
In particular, the framerate hiccups noticably whenever we create a new sprite. The rest of the time, the game runs smoothly, but here we have a complicated thing to do, and it's all happening within one frame, and there just isn't enough space in that frame, in realtime, to avoid a hiccup.
Computers run in sequential time, making them run in real time is an interesting and important problem for games.
@p
One way to address a problem like this, is amortize.
@p
When we say "amortize" what we mean is, something that would normally cost a whole bunch all at once, well, divide it out across a longer period of time. In our case, we aren't calculating the cost of a new bagel machine over it's expected lifetime of 36 months, instead we are spreading the cost of process_image () over, say, 36 frames.
Here's how.
Typically our variables "cur_x" and "cur_y" below would exist inside a class declaration, i.e., be object members, but let's ignore that for a second to make this example simpler:
int cur_x, cur_y;
@p int state = 0; // 0 == none, 1 == processing, 2 == done
@p int process_image_partial (image* I, int Cycles) {
@p if (0 == state) {
@p // state 0 means, "initialize things"; if we set state to 0 and then call this
@p // method, it will initialize whatever member or global variables it needs, and
@p // set itself up to start processing.
@p cur_x = 0;
@p cur_y = 0;
@p state = 1;
@p }
@p else if (1 == state) {
@p // this is the "processing" state; start where we left off (cur_x, cur_y) and iterate
@p // as long as we have cycles to do it.
@p bool BreakFlag = false;
@p for (int y = cur_y; y < I->h; ++y) {
@p for (int x = cur_x; x < I->w; ++x) {
@p // check if we can process any more cycles, and then decrement;
@p // it's best to do it before processing, and in this order, or
@p // we will end up processing the same pixel twice
@p if (Cycles <= 0) {
@p // save so that next frame we start off here
@p cur_x = x;
@p cur_y = y;
@p BreakFlag = true;
@p break;
@p }
@p --Cycles;
@p // process this pixel, assuming we have cycles
I->set_pixel (x, y, raytrace (x, y));}
@p if (BreakFlag) break;
@p // we finished a line; the next one should start at 0
@p cur_x = 0;
@p }
@p // if we made this far, and didn't break, it means we are done!
@p if (!BreakFlag) {
@p state = 2; // set state to finished, so calling routine can tell
@p }
@p }
@p // it is much more convenient to just return unused cycles, and have our calling routine
@p // call us a number of times until they are used up, than to try and write this routine
@p // so that it always uses all cycles. The calling routine can normally just ignore any
@p // unused cycles, anyhow, if it's not worried about taking a few extra frames to finish.
@p return Cycles;
@p }
The calling routine for this would look something like this:
function sprite::update (double T) {
@p // presumably, something happens (e.g., our creation) which necessitates us to call
@p // process image; if so, set it up here
@p if (needs_process_image) {
@p // reset it, no matter what's going on; pretend that state, cur_x, and cur_y are
@p // now members of sprite, not globals.
@p state = 0;
@p process_image_partial (image, 1);
@p needs_process_image = false; // clear this out, it's been initiated
@p }
@p // are we processing image? just raytrace 10 pixels per frame
@p if (1 == state) {
@p process_imgae_partial (image, 10);
@p }
@p // check to see if we finished
@p if (2 == state) {
@p do_something_useful_with_image_now_that_image_is_complete_and_ready (image);
@p // reset state so we don't keep executing this little section.
@p state = 0;
@p }
@p }
This code is complex, but the benefit is huge.
BENEFITS
The complexity can be somewhat mitigated by breaking the process_image_partial () method or similar into multiple function calls. You can sort of avoid using object state, if that's a burden, by just passing the necessary variables in as parameters, but you'll still need that object state to be somewhere to keep that information on where you are at stored frame-to-frame.
@p
Finally, there is just no real way around it-- each time you use it, you'll likely need to write it from scratch. Each situation will require you to think about it fresh. For instance, would it be enough to have each row of each image be considered a "cycle", rather than each pixel? You could save yourself some work.
It's very rewarding to implement this and see framerate go from a hiccupland to butter-smooth.
@p
It's not the right approach for most slowdown situations, of course. I'd recommend optimizing in other ways, first, as a general rule. And if you are waiting for file or other IO, use an actual thread (don't use this to load data from files or get network data, please.) But if you hadn't thought of this technique before, keep it on hand for when you need it.
@p
Amortization is a genuinely useful, powerful and effective technique. It's a specialized tool, and is somewhat difficult to employ-- but it works very, very well, when you need it.
2012-09-28