When to Optimize

The First Rule of Program Optimization:
Don’t do it.

The Second Rule of Program Optimization (for experts only!):
Don’t do it yet.

— Michael A. Jackson

I just took one of my projects off the back burner this weekend and hacked it up. Evolution by natural selection involves differential reproductive success; that is, the ones who have some sort of advantage will leave more offspring than those without it. But how big an advantage do you need? Obviously, if you’re a panther with sharp claws that let you hunt twice as many sheep (or whatever panthers eat) as the panther next to you, that’s an advantage, and you’ll have more kids than he will.

But biologists say that even a 1% difference is pretty big. Intuitively, that doesn’t sound right. If you’re a fly that has laid a million eggs, and 100 of them survive to reproduce, while the fly next to you has also laid a million eggs but only 99 of them survive, what the hell difference does that make? But intuition is often wrong, so I set out to test this with some numbers.

To that end, I wrote a quick and dirty simulation, in Perl, natch. The basic outline should be familiar:

initialize population:
    create 999 individuals with normal genome "aa";
    create 1 individual with mutant heterozygotic genome "Aa";

for (generation = 0; generation < 1000; generation++)
{
    for (i = 0; i < POPULATION_SIZE; i++)
    {
        mother = choose_parent();
        father = choose_parent();
        choose, at random, one of the chromosomes from the mother and
            one from the father to make the child;
        add the child to the new generation;
    }
    population[] = new_population[];    /* once per generation, not per child */
}

choose_parent()
{
    /* Choose a parent at random from the current population; the choice
     * is weighted by the genome of each individual: normals (with no A
     * genes) have a weight of 10000; heterozygotes (with one a gene and
     * one A gene) have a weight of 10100; advantaged homozygotes
     * (with two A genes) have a weight of 10200.
     */
}

This script was taking tens of seconds to simulate 1000 generations with a population size of 1000. It was obvious that it was going to be too slow for large populations, so I rewrote it in C.

Lesson 1: Perl sucks at number-crunching.

If you need to pick a random element from a list, such as a random line from a file of fortune cookie messages, the obvious way to do it is to load the entire file into memory, count the lines (call it N), pick a random number between 1 and N, and output that line.

But with large files, this takes a lot of memory, so I’ve been using a more elegant, one-pass approach:

n = 0;
while (there are more lines) {
    read a line;
    n++;
    if (random(n) == 0)
        fortune = the line that was just read;
}
print fortune;

The way this works is that you have a 1 in 1 chance of picking the first line, a 1 in 2 chance of replacing it with line 2, a 1 in 3 chance of replacing the current pick with line 3, and so forth. The math works out so that every line has a 1 in N chance of being printed: line k is picked with probability 1/k, then survives each later round i with probability (i-1)/i, and the product telescopes to 1/N. The cool thing about this technique is that you don’t have to count the lines ahead of time, and you only need as much memory as it takes to hold the longest line.
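
Here’s that loop as a small, self-contained C program. This is a sketch, not anything from the original script: it reads from stdin, and rand() % n stands in for a proper random(n) (the modulo introduces a slight bias, but it illustrates the technique):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

int main(void)
{
    char line[1024];
    char fortune[1024] = "";
    int n = 0;

    srand((unsigned) time(NULL));
    while (fgets(line, sizeof line, stdin) != NULL)
    {
        n++;
        if (rand() % n == 0)            /* 1-in-n chance of replacing */
            strcpy(fortune, line);
    }
    if (n > 0)
        fputs(fortune, stdout);
    return 0;
}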

For a weighted choice, where lines with a heavier weight are chosen more often, use a variant of the above:

tot_weight = 0;
while (there are more lines) {
    read a line;
    w = weight of current line;
    tot_weight += w;
    if (random(tot_weight) < w)
        fortune = the line that was just read;
}
print fortune;
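
In C, only the bookkeeping around the replacement test changes. This sketch invents a format where each line begins with an integer weight:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

int main(void)
{
    char line[1024];
    char fortune[1024] = "";
    long tot_weight = 0;

    srand((unsigned) time(NULL));
    while (fgets(line, sizeof line, stdin) != NULL)
    {
        int w;
        if (sscanf(line, "%d", &w) != 1 || w <= 0)
            continue;                   /* skip lines with no weight */
        tot_weight += w;
        if (rand() % tot_weight < w)    /* w-in-tot_weight chance */
            strcpy(fortune, line);
    }
    if (tot_weight > 0)
        fputs(fortune, stdout);
    return 0;
}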

This weighted one-pass pick is what I was using to pick random parents (the choose_parent() function in the pseudocode above). But eventually it occurred to me that I already had the entire population in memory and knew how many elements I was choosing from, so the advantages of the one-pass technique weren’t doing anything for me.

So I replaced that approach with a different one:

  1. Calculate the total weight W of the entire parent population.
  2. Choose a random number R between 0 and W-1.
  3. Traverse the parent population, adding up the weights, until the running total exceeds R.
  4. Pick the member on which you stopped.

This approach is still O(n), just like the one-pass trick, but it calls random() once instead of once per element. That alone bought a significant performance improvement.
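
In C, with the population already in an array, it might look like this (the struct and field names are my guesses at the shape of the real code, not the actual program):

#include <stdlib.h>

struct individual
{
    char genome[3];                     /* "aa", "Aa", or "AA" */
    long weight;
};

/* Weighted pick: one call to random(), then a linear scan.  Note
 * that random() tops out at 2^31-1, so for very large total weights
 * you'd need a wider random number. */
struct individual *choose_parent(struct individual *pop, int size)
{
    long total = 0, running = 0, r;
    int i;

    for (i = 0; i < size; i++)          /* step 1: total weight W */
        total += pop[i].weight;
    r = random() % total;               /* step 2: R in [0, W-1] */
    for (i = 0; i < size; i++)          /* steps 3 and 4: scan */
    {
        running += pop[i].weight;
        if (running > r)
            return &pop[i];
    }
    return &pop[size - 1];              /* not reached if total > 0 */
}

In the real thing you’d presumably compute W once per generation rather than on every call, but the point stands: one random() per pick instead of one per element.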

Lesson 2: random() is slow

At the same time, I rewrote the code that chooses a random chromosome from the father and a random chromosome from the mother: instead of calling random() for each parent, I realized that there are only four options, so I called random(4) once, and used a switch statement to pick the right one. That also helped a bit.
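
In C that’s just a four-way switch (assuming a genome is stored as two chars, one per chromosome):

#include <stdlib.h>

/* Build a child's genome with a single call to random(): the four
 * cases cover both chromosome choices at once. */
void make_child(const char mother[2], const char father[2], char child[2])
{
    switch (random() % 4)
    {
    case 0: child[0] = mother[0]; child[1] = father[0]; break;
    case 1: child[0] = mother[0]; child[1] = father[1]; break;
    case 2: child[0] = mother[1]; child[1] = father[0]; break;
    case 3: child[0] = mother[1]; child[1] = father[1]; break;
    }
}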

I also added the weight of the individual inside the data structure that held its genome, just so I wouldn’t have to recompute it every time by looking at its genes. That didn’t help much, but hey.

But the business of counting the cumulative weight and choosing a random parent got me thinking: if the cumulative weight were also stored in the data structure describing an individual (the cumulative weight being the sum of the individual’s weight and the weights of all the individuals to its left), that cumulative weight would increase monotonically along the population array. So choosing a random element is really: pick a number R less than the total weight, then find the element with the smallest cumulative weight that’s larger than R. And since the cumulative weight increases monotonically, the array is already sorted, which means we can use binary search.

For those who didn’t take CS in college, the technique is similar to looking up a word in the dictionary: you could go through it page by page, entry by entry, looking for the word you want, but that would take forever. So instead, you open the dictionary in the middle and look at the first entry on the page. If the word you’re looking for comes before that, you can ignore the entire second half of the book; if it comes after, you can ignore the entire first half. Take the half you still need to search (let’s say it’s the first half) and open to the page in the middle of it, i.e., one quarter of the way through the dictionary, and look at the first entry. Again you can see whether your word comes before or after that entry, and can therefore eliminate half of the remaining pages. Continue in this way until you’ve found the entry you’re looking for.

r = random(total_weight);   /* total_weight is the cumulative
                             * weight of the last individual */
min = 0;
max = POPULATION_SIZE-1;
while (min != max)
{
    i = (min + max) / 2;
    if (r >= population[i].cumulative_weight)
        min = i+1;
    else
        max = i;
}
choose population[min];
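
The cumulative weights only have to be recomputed once per generation, after the new population is built. A sketch, assuming a cumulative_weight field added to the struct from earlier:

/* Recompute cumulative weights after each generation; the return
 * value is the total weight that random() is drawn against. */
long update_cumulative_weights(struct individual *pop, int size)
{
    long total = 0;
    int i;

    for (i = 0; i < size; i++)
    {
        total += pop[i].weight;
        pop[i].cumulative_weight = total;
    }
    return total;
}

That’s one O(n) pass per generation against an O(log n) search per pick, so the binary search still wins as the population grows.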

I knew that this O(log n) algorithm was faster than the O(n) one I’d been using so far, but this time I actually ran a benchmark, with a test rig to get accurate results. With just this change, the execution time of the program went from about 66 seconds to 3.5 seconds.

Lesson 3: Faster algorithms actually do run faster, for sufficiently large N

Lesson 4: Hey, I actually did learn something useful in college!

My current desktop machine is faster than any other I’ve owned, so it’s easy to fall into the “just throw faster hardware at it” frame of mind. And I normally write in Perl because most of the time I’m optimizing for developer time, i.e., I just want to write a quick hack for this one thing. By the same token, the code I write is usually only meant to run once, so it doesn’t matter if it takes 10 seconds to execute instead of 5, especially if that saves me half an hour of development. But in this case, it turned out to be worth the effort of optimizing. As I said, the original Perl script simulated 1000 generations of a population of 1000 in a few tens of seconds. The new version simulates populations of 1,000,000 at a rate of about one generation every two seconds.

Lesson 5: Every once in a while, it’s worth optimizing.

One thought on “When to Optimize”

  1. I had a problem with rand() optimization too. I decided to use Knuth’s fast PRNG, since I didn’t need a cryptographic level of security.
    This code generates random numbers in blocks of 55 in an array:
    for (i=0; i<55; ++i) rsl[i] = rsl[i] + rsl[(i+24) % 55];

    I wrote up a more complete story about my random-number optimization on my blog, but it seems your spam blocker doesn’t want me to post the URL 😉

  2. Jok:
    Good point. I probably should’ve pulled out my copy of Knuth (then again, I always get annoyed at his habit of presenting examples in assembly language).

    As for the link, sorry, but spammers have been a real pain.
